Goto

Collaborating Authors

 teaching pre-trained model


Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Neural Information Processing Systems

Evidence suggests that large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control. Recently, it has been shown that Transformer-based models succeed in consistent reasoning over explicit symbolic facts, under a closed-world assumption. However, in an open-domain setup, it is desirable to tap into the vast reservoir of implicit knowledge already encoded in the parameters of pre-trained LMs. In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements. To do this, we describe a procedure for automatically generating datasets that teach a model new reasoning skills, and demonstrate that models learn to effectively perform inference which involves implicit taxonomic and world knowledge, chaining and counting. Finally, we show that teaching models to reason generalizes beyond the training distribution: they successfully compose the usage of multiple reasoning skills in single examples. Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.


Review for NeurIPS paper: Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Neural Information Processing Systems

Weaknesses: Overall, I really liked the experiments in the paper, but some of the analysis could be made more complete. In particular, I had the following questions about the experiments: 1. In the experiments of Section 4.1, a possible explanation for the model's performance could be that it's not due to implicit reasoning but due to the distractor subject leaking the answer: For example, given A mammal has a belly button and A whale eats fish, the model can infer that a whale has a belly button simply by combining these facts instead of doing implicit reasoning. One possible explanation for why the hypothesis-only results are poor is because the hypothesis-only conditions are only seen in 20% of the examples. What happens if the model is re-trained with only hypothesis information and labels, and then evaluated in the hypothesis only mode? 4. To isolate the effect of pre-trained representations, why do the authors choose to use a different architecture (ESIM) instead of using RoBERTa with randomly initialized weights? 5.


Review for NeurIPS paper: Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Neural Information Processing Systems

All 4 reviewers support acceptance for the contribution. I believe the contribution is original and intriguing enough to merit a spotlight. This summary from R4 shows how the work in this paper opens new possibilities in NLP, complementing powerful adaptable models such as GPT-3. "This paper shows that it is possible to adapt pretrained language models (LMs) on-the-fly based on natural language text in order to correct the model's behavior. When an LM would answer a question incorrectly, the authors supplement the model with a hint or relevant piece of evidence in the form of natural language text and find that the model is then able to produce the correct answer. This results are a proof of concept that large, black-box LMs can be adapted/corrected in a natural way / potentially by non-expert users of the system, simply by providing relevant natural language text."


Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Neural Information Processing Systems

Evidence suggests that large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control. Recently, it has been shown that Transformer-based models succeed in consistent reasoning over explicit symbolic facts, under a "closed-world" assumption. However, in an open-domain setup, it is desirable to tap into the vast reservoir of implicit knowledge already encoded in the parameters of pre-trained LMs. In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements. To do this, we describe a procedure for automatically generating datasets that teach a model new reasoning skills, and demonstrate that models learn to effectively perform inference which involves implicit taxonomic and world knowledge, chaining and counting. Finally, we show that "teaching" models to reason generalizes beyond the training distribution: they successfully compose the usage of multiple reasoning skills in single examples.